Identification and Correction of Sample Mix-Ups in Expression Genetic Data: A Case Study

Authors

  • Karl W Broman
  • Mark P Keller
  • Aimee Teo Broman
  • Christina Kendziorski
  • Brian S Yandell
  • Śaunak Sen
  • Alan D Attie
Abstract

In a mouse intercross with more than 500 animals and genome-wide gene expression data on six tissues, we identified a high proportion (18%) of sample mix-ups in the genotype data. Local expression quantitative trait loci (eQTL; genetic loci influencing gene expression) with extremely large effect were used to form a classifier to predict an individual's eQTL genotype based on expression data alone. By considering multiple eQTL and their related transcripts, we identified numerous individuals whose predicted eQTL genotypes (based on their expression data) did not match their observed genotypes, and then went on to identify other individuals whose genotypes did match the predicted eQTL genotypes. The concordance of predictions across six tissues indicated that the problem was due to mix-ups in the genotypes (although we further identified a small number of sample mix-ups in each of the six panels of gene expression microarrays). Consideration of the plate positions of the DNA samples indicated a number of off-by-one and off-by-two errors, likely the result of pipetting errors. Such sample mix-ups can be a problem in any genetic study, but eQTL data allow us to identify, and even correct, such problems. Our methods have been implemented in an R package, R/lineup.
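The detection idea in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the R/lineup implementation: the abstract does not say which classifier was used, so a nearest-centroid rule is assumed here, and the function names (`predict_genotypes`, `find_mixups`, `simulate`), the effect size, the noise level, and the 0.3 mismatch threshold are all hypothetical choices made for illustration.

```python
import random
from statistics import median


def predict_genotypes(expr, geno):
    """Predict each sample's genotype at each eQTL from expression alone.

    Assumed classifier: nearest centroid, where the centroid for a genotype
    is the median expression among samples with that observed genotype.
    """
    n_samples, n_eqtl = len(expr), len(expr[0])
    pred = [[0] * n_eqtl for _ in range(n_samples)]
    for j in range(n_eqtl):
        centroids = {}
        for g in (0, 1, 2):
            vals = [expr[i][j] for i in range(n_samples) if geno[i][j] == g]
            if vals:
                centroids[g] = median(vals)
        for i in range(n_samples):
            pred[i][j] = min(centroids, key=lambda g: abs(expr[i][j] - centroids[g]))
    return pred


def mismatch(a, b):
    """Proportion of eQTL at which two genotype vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def find_mixups(expr, geno, threshold=0.3):
    """Map each suspect expression sample to the best-matching genotype sample."""
    pred = predict_genotypes(expr, geno)
    result = {}
    for i, p in enumerate(pred):
        if mismatch(p, geno[i]) > threshold:
            # Look for another DNA sample whose observed genotypes fit the prediction.
            result[i] = min(range(len(geno)), key=lambda k: mismatch(p, geno[k]))
    return result


def simulate(n_samples=60, n_eqtl=100, seed=1):
    """Simulated intercross (1:2:1 genotypes) with one swapped pair of DNA samples."""
    rng = random.Random(seed)
    true_geno = [[rng.choice((0, 1, 1, 2)) for _ in range(n_eqtl)]
                 for _ in range(n_samples)]
    # Very large local-eQTL effect relative to the noise, as in the abstract.
    expr = [[2.0 * g + rng.gauss(0, 0.2) for g in row] for row in true_geno]
    observed = [row[:] for row in true_geno]
    observed[3], observed[4] = observed[4], observed[3]  # the mix-up
    return expr, observed


expr, observed = simulate()
mixups = find_mixups(expr, observed)
print(mixups)
```

In this simulation, samples 3 and 4 are flagged and each is matched to the other's genotype record, which mirrors the abstract's two steps: find individuals whose predicted and observed genotypes disagree, then look for the genotype sample that does agree. Medians are used for the centroids so that a few swapped samples do not distort them.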

Similar resources

MixupMapper: correcting sample mix-ups in genome-wide datasets increases power to detect small genetic effects

MOTIVATION Sample mix-ups can arise during sample collection, handling, genotyping or data management. It is unclear how often sample mix-ups occur in genome-wide studies, as there currently are no post hoc methods that can identify these mix-ups in unrelated samples. We have therefore developed an algorithm (MixupMapper) that can both detect and correct sample mix-ups in genome-wide studies th...


A Hybrid Business Success Versus Failure Classification Prediction Model: A Case of Iranian Accelerated Start-ups

The purpose of this study is to reduce the uncertainty of early-stage start-up success prediction and to fill the gap left by previous studies in the field, by identifying and evaluating the success variables and developing a novel business success/failure (S/F) data mining classification prediction model for Iranian start-ups. For this purpose, the paper seeks to extend Bill Gross and Robert L...


Identification and Functional Prediction of Long Non-Coding RNAs Responsive to Drought stress in Lens culinaris L.

Drought stress is one of the main environmental factors that affect the growth and productivity of crop plants, including lentil. In the course of evolution, crucial genetic regulations mediated by non-coding RNAs (ncRNAs) have emerged in plants in response to drought and other abiotic stresses. In the present study, after identifying lncRNAs within the expression profile of lentil, RNA-s...


Investigate and Classify the Most Important Effective Components of P7 Marketing Mix on Demand for Handmade Carpets (Case Study of Sistan Handmade Carpets)

Sistan's handmade carpet is a legacy of authentic Iranian culture and tradition that is now in decline. The decline in sales has led to a reduction in the production of these carpets. The purpose of this study is to investigate the status of the marketing process based on the 7P model (product, price, place, promotion, people, process, physical evidence) in the Sistan handmade carpet industry. Thi...


Marketing Mix Elements - A Case Study on Steel Industry Export

Steel industries play a key role in the national economy and the welfare of society in many steel-manufacturing countries. The manufacture and consumption of steel products is a key indicator for measuring and evaluating the economic and industrial performance of a country. Nowadays, countries with large natural oil and gas resources (e.g. Iran) attempt to select an alt...


Journal title:

Volume 5, Issue

Pages -

Publication date: 2015